I. INTRODUCTION


There  are  many  metrics  for  gauging  software  during  the  various  activities  of  development, 
but  each  leaves  questions  about  the  quality  and  maintainability  of  the  software.  A  new  metric 
is  proposed  which  may  overcome  some  of  these  problems  while  giving  a  strong  intuitive  model 
for  gauging  progress  through  design,  coding,  and  testing  phases  of  development.  A  proposal  is 
made  for  the  collection  of  data  for  testing  and  evaluation  of  this  model. 

A.  Purpose 

This document serves as an updated review of the software metric literature and a proposal for developing software metric tools. It can be used as a source book for entry into current publications regarding metrics and as a reference for the general content of well-known metrics.


B.  Application 


The U.S. Army Missile Command, Software Engineering Directorate is responsible, as a Life Cycle Software Support Center, for maintenance of missile system embedded software and for technology assessment and consultation concerning the acquisition of such software. A spectrum of metrics is required for maintenance, for prediction of schedules, and for counseling regarding software system architectures and methodologies. This report provides insight into characteristic factors which can be and have been measured. The references give access to more detailed information.

In  addition,  a  proposal  is  made  to  collect  data  from  the  various  projects  maintained 
by  the  directorate  to  more  effectively  evaluate  proposals  and  software  development  efforts  by 
contractors  developing  missile  system  software.  The  metric  model  presented  is  a  framework 
for  the  collection  and  evaluation  of  the  required  information. 

C.  Organization 


The  paper  first  surveys  management  indicators,  software  quality  data  collection  and 
measures,  and  software  structural  metrics.  Details  are  provided  concerning  the  factors  which 
contribute  to  each  metric.  Calculation  equations  are  presented  where  applicable.  Comments 
are  made  concerning  the  comparison  of  various  methods  and  models. 

An evaluation is made of the various metrics for use in the software acquisition process. Measurable factors are identified in relation to the phase in the development cycle during which they are available and beneficial.

A unified model gives a framework for the analysis of the effects of the various factors on subsequent maintainability of the software. The model is extended to enhance the intuitive insight gained from use of the unified model.

A  mapping  is  made  between  the  metric  factors  and  the  structural  features  of  the  Ada 
programming  language. 
II. SURVEY OF SOFTWARE METRICS


A sizable library of software metrics is available. This brief survey will put most of the commonly used ones into perspective by describing their content and elucidating their methodologies. They are classified here as management indicators, software quality measures, and structural complexity measures.

A.  Management  Indicators 

Software  acquisition  demands  the  use  of  management  indicators  as  mandated  by  AR 
70-13  and  the  standard  review  processes  [13].  Government  and  contractor  staff  are  familiar 
with  these  measures,  but  they  are  included  here  for  completeness. 

There are some good and fairly widely used methods of predicting project size, schedule, and cost. These methods are empirical, statistically based, and oriented, due to their databases, toward specific applications. For example, the Constructive Cost Model (COCOMO) [5] computes development effort as

$$MM = a\,(KDSI)^{b}\,m(x),$$

where MM is the number of man-months required to produce the software product, a and b are empirically derived constants obtained from production mode and level definitions, KDSI is the size of the code in thousands of Delivered Source Instructions, and m(x) is a factor computed from cost-driving attributes. The level and production mode parameters reflect the size and constraints required by the specific project. The cost-driving attributes used to compute m(x) are:


Product attributes
    required software reliability
    database size
    product complexity
Computer attributes
    execution time constraint
    main storage constraint
    virtual machine volatility
    computer turnaround time
Personnel attributes
    analyst capability
    application experience
    programmer capability
    virtual machine experience
    programming language experience
Project attributes
    modern programming practice
    use of software tools
    required development schedule.
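The effort equation above can be sketched in a few lines. This is a minimal illustration, not the report's tooling: the function name `cocomo_effort` is hypothetical, the constants a and b are passed in rather than fixed, and m(x) is assumed to be the product of the individual cost-driver ratings (the usual COCOMO convention, which this report does not spell out).

```python
from math import prod


def cocomo_effort(kdsi: float, a: float, b: float, multipliers: list[float]) -> float:
    """Return estimated effort in man-months: MM = a * KDSI**b * m(x)."""
    if kdsi <= 0:
        raise ValueError("KDSI must be positive")
    # Assumption: m(x) is the product of the cost-driver ratings.
    # With no ratings supplied, prod([]) is 1.0, i.e. every driver is nominal.
    m_x = prod(multipliers)
    return a * kdsi ** b * m_x


# Illustrative inputs only; these numbers are not taken from the report.
effort = cocomo_effort(kdsi=32.0, a=3.0, b=1.12, multipliers=[1.15, 0.91])
print(f"estimated effort: {effort:.1f} man-months")
```

Because KDSI enters through an exponent, an early error in the size estimate propagates more than proportionally into the effort estimate whenever b exceeds 1.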

A derivative of COCOMO called SECOMO was developed by the Army. These measures are highly dependent on KDSI, which may not be easy to estimate early in a project. None of these factors are computed from the design structure of a project's computer programs. If there were a way to calculate KDSI from the architectural and communication structure of the early top-level design, more confidence could be placed in the validity of progress reports based on KDSI-dependent measures. For example, the fact that 90 percent of the program code is complete does not mean that the difficult 90 percent is complete.

An approach to measuring predicted size comes from function point analysis [30], a metric based on both software characteristics and environment [4]. The size of a program module in lines of source code is computed as

$$SIZE(SLOC) = (ARCH)(EXPF)\big((LANG \cdot FP_a) + OOC_n\big)\,a,$$

where the factors are defined as:

    ARCH  = architectural factor
    EXPF  = expansion factor
    LANG  = language expansion factor
    FP_a  = adjusted function point count
    OOC_n = normalized operand/operator count
    a     = reuse factor.

Each  of  the  factors  has  a  defined  range  and  is  adjusted  in  magnitude  for  the  specific  application 
being  sized.  Typically,  the  architectural  factors  would  be  defined  as: 


    centralized                          1.0
    tightly coupled multiprocessor       1.3
    loosely coupled multiprocessor       1.5
    federated                            1.6
    distributed with central database    1.8
    fully distributed                    2.1
    array processor                      0.9

The expansion factor EXPF is a product,

$$EXPF = c_k \prod_{i} SM_i,$$

where c_k is a calibration factor and SM_i is a size modifier which can be defined typically as:

    requirements volatility                  0.95 to 1.18
    database size                            0.94 to 1.11
    degree of real time                      0.90 to 1.16
    use of modern programming techniques     0.93 to 1.11
    use of software tools                    0.89 to 1.10
    analyst capability                       0.89 to 1.19
    application experience                   0.91 to 1.15
    environment experience                   0.95 to 1.10
    language experience                      0.91 to 1.13.

LANG for the language Ada would be 72 lines of code per function point, with a correlation of about 0.887 for Reifer's data set. The function point count FPa is the sum of inputs, outputs, master files, modes, and interfaces. For real-time systems, the sum would include stimulus-response pairs and rendezvous.
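The size calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the reuse factor is assumed to multiply the whole expression, which is how the size equation reads here but is not confirmed elsewhere in this report.

```python
from math import prod


def expansion_factor(c_k: float, size_modifiers: list[float]) -> float:
    """EXPF: calibration factor c_k times the product of the size modifiers."""
    return c_k * prod(size_modifiers)


def estimate_sloc(arch: float, expf: float, lang: float,
                  fp_adjusted: float, ooc_normalized: float, reuse: float) -> float:
    """Estimated source lines: ARCH * EXPF * (LANG * FPa + OOCn) * reuse.

    Assumption: the reuse factor is applied as a plain multiplier.
    """
    return arch * expf * (lang * fp_adjusted + ooc_normalized) * reuse


# Illustrative inputs: a centralized design (ARCH = 1.0), Ada (LANG = 72),
# and two size modifiers picked from within the typical ranges listed above.
expf = expansion_factor(c_k=1.0, size_modifiers=[1.10, 0.95])
size = estimate_sloc(arch=1.0, expf=expf, lang=72.0,
                     fp_adjusted=120.0, ooc_normalized=0.0, reuse=1.0)
print(f"estimated size: {size:.0f} SLOC")
```

Every input except the function point count is a judgment-based rating, so the estimate is only as objective as those ratings.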

Function  point  analysis  depends  indirectly  on  the  internal  structure  of  the  source 
code.  It  has  reportedly  [30]  been  successful  in  measuring  28  projects  within  20  percent  and  has 
been  made  accessible  through  desk  top  computers.  But,  the  internal  structure  of  modules  is  not 
visible  enough  to  determine  whether  the  most  complex  or  difficult  work  on  a  project  has  been 
completed  or  even  well  defined. 

There  are  other  methods  in  addition  to  COCOMO  for  measuring  effort  as  surveyed 
by  [9].  These  include: 

1.  criteria  based  on  validity,  objectivity,  ease  of  use,  sensitivity, 
transportability,  and  other  subjective  top  level  qualities, 

2.  methods  based  on  least  squares  curve  fit  to  parameterized  linear 
and  non-linear  functions  of  time  which  represent  effort  level, 

3.  level  of  difficulty  models  which  are  calculations  based  on  subjective 
estimation  of  level  of  difficulty  of  various  phases  of  development, 

4.  statistical  models  using  regression  analysis  to  fit  polynomials  or  other  curves, 
such  as  the  Rayleigh  distribution,  to  data  on  software  development  time. 


All of these management indicators are intended to provide a top-level rather than a detailed view of the software and its development [1]. A survey of management indicators shows measurements for computer resource utilization, software development effort, requirements and definition stability, software progress, development and test, cost/schedule deviations, and the use of software development tools.

B.  Software  Quality  Measures 

As  can  be  seen  from  a  general  survey  of  the  literature  on  software  metrics  [10],  there 
are: 


1.  classic  metrics  based  on  software  science,  cyclomatic  complexity,  and  function 
points, 

2.  life  cycle  metrics  based  on  analysis,  software  design,  code  structure,  quality 
assurance,  and  method, 

3.  code  metrics  designed  for  specific  languages, 

4.  new  metrics  based  on  the  development  process  and  effort,  graph  structure 
of  software,  and  information  content, 

5.  metrics based on software process models.

From the abundance of software measuring methods there should come better metrics. The ideal metric should be an automatable one based on the architectural and communication structure of the abstract software system such that progress in the development process can be measured and an intuitive judgement can be made as to the complexity and quality of the resulting software [24,28].

Many metrics of software quality and style indicators use internal properties of code, some being very subjective and others being automated and analytical. Most of these require source code for analysis, and thus cannot be initiated in the design phases of a project. But these metrics give valuable insight into what should be measured in order to determine the internal quality, complexity, and developmental progress of software. In contrast to management indicators, quality indicators have a higher level of resolution with respect to the internal characteristics of the software.


Generally, the highest level attributes have been related to low level characteristics of software, but not in a quantitative way. For example, [31] shows the relationships demonstrated in Figure 1. As Rossan points out, design indicators are measured, using some common scale, from programs and documentation citing attributes present in the software. Management indicators are the results of reviews, inspections and tests, and software behavior using behavioral and acquisitional metrics. What is needed is a set of design indicators which can be used as the basis for management indicators of a more meaningful nature.


Attribute:

Activities:
    hierarchical decomposition
    function decomposition
    information hiding
    step-wise refinement
    structured programming
    life cycle verification
    concurrent documentation

Characteristics:
    coupling
    cohesion
    complexity
    well defined interfaces
    readability
    ease of change
    traceability
    visibility of behavior
    early error detection

Figure 1.


Another approach to the organization of metrics for software quality is given by [3]. A quality metric tree, shown here in indented form, shows the relationships of higher level properties to lower level software characteristics:

Quality
    correctness
        completeness, consistency, traceability
    efficiency
        concision, execution efficiency, operability
    flexibility and maintainability
        complexity, concision, consistency, expandability, generality, modularity, self documentation, simplicity
    integrity
        auditability, instrumentation, security
    interoperability
        communication commonality, generality, data commonality, modularity
    portability and reusability
        generality, hardware independence, modularity, self documentation, software system independence
    reliability
        accuracy, error tolerance, simplicity, consistency, modularity
    testability
        auditability, instrumentation, self documentation, simplicity, complexity, modularity
    usability
        operability, training

The  lower  level  characteristics  specified  in  this  model  can  not  be  measured  economically  with 
today’s  technology.  Nevertheless,  all  of  these  factors  should  contribute  in  some  way  to  the 
evaluation  of  a  software  product.  Even  though,  as  pointed  out  by  [11],  it  is  difficult  to  have  a 
metric  which  can  measure  both  process  and  product,  there  should  be  a  way  for  these  factors  to 
provide  feedback  to  developers  and  programmers  as  the  project  unfolds. 

One way to provide such feedback is the work sheets which result from reviews. In fact, work sheets [34] and check lists [29] often provide a principal medium for measuring software quality. Manuals have been written [26] which present trade-offs generated by software standards and delineate quality factor rating guidelines based on subjective work sheets. Integrity, maintainability, portability, reusability, usability, testability, flexibility, and interoperability are all at odds with program code efficiency in the traditional sense. Flexibility, interoperability, and reusability must be balanced with the code integrity. Reusability must be measured with respect to reliability. Factors such as these can be measured subjectively using worksheets and quality factor rating guidelines. Factors and metrics used in such evaluations are typically [26] quality of comments, complexity, completeness, operability, user interface, data commonality, effectiveness of comments, traceability, consistency, training, and communication commonality. Check sheets generate values which form a matrix for k modules and n module level measurements as

$$M = \begin{bmatrix} M_{11} & M_{12} & \cdots & M_{1k} \\ \vdots & \vdots & & \vdots \\ M_{n1} & M_{n2} & \cdots & M_{nk} \end{bmatrix}.$$

From this matrix, indicators can be calculated such as, for metric i, the average and distance from mean:

$$A_i = \sum_{j=1}^{k} \frac{M_{ij}}{k}, \qquad \sigma_i = \sqrt{\sum_{j=1}^{k} \frac{(M_{ij} - A_i)^2}{k}}.$$

Based on the result, a module j would be reported for examination if

$$M_{ij} < A_i - \sigma_i.$$


Factors in this model can be normalized by data from previous projects. Thus, resolution and flexibility for application are available using these methods, but a more objectively based metric would be preferable.
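The check-sheet screening rule can be sketched as below. This is a minimal illustration: the function name `flag_modules` is hypothetical, the matrix is a plain list of rows (one row per metric, one column per module), and the "distance from mean" is assumed to be the population standard deviation.

```python
from math import sqrt


def flag_modules(matrix: list[list[float]]) -> dict[int, list[int]]:
    """Map each metric index i to the module indices j whose score falls
    more than one deviation below that metric's average."""
    flagged: dict[int, list[int]] = {}
    for i, row in enumerate(matrix):
        k = len(row)
        average = sum(row) / k
        # Assumption: deviation is the population standard deviation.
        sigma = sqrt(sum((score - average) ** 2 for score in row) / k)
        low = [j for j, score in enumerate(row) if score < average - sigma]
        if low:
            flagged[i] = low
    return flagged


# Illustrative scores: two metrics rated for four modules.
scores = [
    [8.0, 9.0, 2.0, 9.0],  # metric 0: module 2 scores well below the rest
    [5.0, 5.0, 5.0, 5.0],  # metric 1: no spread, so nothing is flagged
]
print(flag_modules(scores))
```

The rule is relative: a module is reported only because it lags its peers on a metric, so a uniformly poor project raises no flags unless the scores are first normalized against data from earlier projects.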

Another aspect of software quality, which must be measured for the purpose of software acquisition, is supportability. A supportability metric has been developed by Frank Blackwell of the Army Missile Command's Software Engineering Directorate. This model is based on the COCOMO model [5]. The subsystem supportability factor is calculated by summing the values for each factor selected in a supportability matrix and then subtracting the total from 100. A subsystem scoring the highest supportability in each factor will receive a rating of 100 (excellent). A subsystem scoring nominal in each factor will receive a rating of 75 (fair). Any subsystem scoring below 60 is considered to have unacceptable supportability. Also, a subsystem can be rated as having unacceptable supportability if the Memory or Throughput Utilization exceeds a specific value. The overall subsystem supportability average is calculated by multiplying each of the subsystem supportability ratings by the subsystem source lines of code, totaling the scores, and then dividing by the total source lines of code. The four system supportability factors are then totaled and added to the subsystem average for the system supportability rating. If a subsystem is determined unsupportable, the system supportability should be calculated without the subsystem and the unsupportable subsystem should be identified. The model can be used to determine deficient areas and areas where improvement will result in a higher supportability rating. The supportability factors and their definitions are shown in Figure 2.
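The subsystem arithmetic described above can be sketched as follows. This is a minimal illustration: the function names and the tuple layout are hypothetical, the factor values are placeholders rather than entries from the actual supportability matrix, and the separate Memory and Throughput Utilization check is omitted because its threshold is not given here.

```python
def subsystem_rating(factor_values: list[float]) -> float:
    """Rating = 100 minus the sum of the values selected in the matrix."""
    return 100.0 - sum(factor_values)


def weighted_subsystem_average(
    subsystems: list[tuple[str, float, int]], threshold: float = 60.0
) -> tuple[float, list[str]]:
    """Return the SLOC-weighted average rating of the supportable subsystems
    and the names of those excluded for rating below the threshold.

    Each entry is (name, rating, source_lines_of_code).
    """
    supportable = [(r, sloc) for _, r, sloc in subsystems if r >= threshold]
    unsupportable = [name for name, r, _ in subsystems if r < threshold]
    total_sloc = sum(sloc for _, sloc in supportable)
    if total_sloc == 0:
        raise ValueError("no supportable subsystem with source lines to average")
    average = sum(r * sloc for r, sloc in supportable) / total_sloc
    return average, unsupportable


# Illustrative subsystems; the names, ratings, and sizes are invented.
average, excluded = weighted_subsystem_average(
    [("guidance", 80.0, 1000), ("telemetry", 90.0, 3000), ("display", 50.0, 500)]
)
print(f"weighted average {average:.1f}, excluded: {excluded}")
```

Weighting by source lines means a large, moderately supportable subsystem moves the average more than a small, excellent one.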

Indirect  approaches  to  quality  measurement  have  been  used  such  as  examination  of 
the  design  documentation  as  the  primary  data  source.  Taxonomies  have  been  developed  to  aid 
in  such  documentation  analysis.  The  following  documentation  tree  [33]  shows  many  factors 
common  to  the  quality  of  the  program  code  itself. 


adequacy
    accuracy
        requirement/design traceability (top down, bottom up equivalence)
        consistency
            conceptual (invariance of concept)
            factual (interface, database security, error recovery, I/O, performance, timing)
    completeness
        domain coverage
        document relationships
            decomposition (refinement enunciation)
            referential (TBD/TBS, % missing, % appropriate)
        modification tracking
            code
            documents
usability
    logical traceability
    references (TBD/TBS, missing, appropriate)
    term consistency
    sufficiency of index and table of contents
    intra-document completeness
    readability (consistency, standards)
    physical (print, format, modularity)
    accessibility/availability
    expandability

In this model, the recurring theme appears of measurable upper-level attributes supported by subjectively measurable lower-level parameters. The dependence on secondary characteristics introduces another level of abstraction away from measuring the actual software architecture.

To  some  extent  the  quality  of  software  can  be  determined  by  the  nature  of  the  fault 
structure  which  emerges  during  testing  and  integration.  Fault  analysis  is  an  important  part  of 
measuring  software  quality  and  reliability.  Standard  metrics  for  fault  content  [20]  are: 

1.  fault density per Thousand Lines of Source Code (KLOC),

2.  defect density per KLOC based on defects found in reviews,

3.  cumulative failure profile,

4.  fault-days and various combined defect indices.
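The first three measures reduce to simple counting, sketched below. This is a minimal illustration with hypothetical function names; it assumes faults and defects have already been logged as counts and that size is given in source lines.

```python
from itertools import accumulate


def density_per_kloc(count: int, source_lines: int) -> float:
    """Faults (or review defects) per thousand lines of source code."""
    if source_lines <= 0:
        raise ValueError("source_lines must be positive")
    return count / (source_lines / 1000.0)


def cumulative_failure_profile(failures_per_period: list[int]) -> list[int]:
    """Running total of failures observed in successive test periods."""
    return list(accumulate(failures_per_period))


# Illustrative data: 30 faults found in 15,000 lines, logged over five periods.
print(density_per_kloc(30, 15_000))
print(cumulative_failure_profile([12, 9, 5, 3, 1]))
```

A cumulative profile that flattens over successive periods suggests the fault discovery rate is falling, which is the usual reading of such a curve.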

Fault analysis is a two-edged sword in that the more faults collected for the analysis, the more valid the model is; yet more faults (disregarding seed faults) undermine confidence in the system. One type of fault analysis which can be easily automated is the determination of variables defined but not used [36]. This method is based on predicate calculus models of requirements and leads to specification-dependent testing of software. The system generates test data and programs. In this case, an evaluation metric is not the end product. Another example of a fault analysis system produces random inputs independently from the input domain according to a typical operation distribution. Errors produced are counted and analyzed using deterministic Bayesian and Markov error counting models. Although automated tools can be used to collect fault data, the major problem remains in not having the metrics early enough in the project's development.

Intuitively, design quality can contribute to fewer faults for correction in the end product. In view of this, measurement of specific attributes of the software resulting from a design would be helpful. An example of such an approach is [2]. This particular method collects data on the properties of developed program code to determine whether a given development methodology can generate high quality code. The factors used in this analysis are described by the following tree:

Reusability 

Hierarchical  decomposition 
Information  hiding 
coupling 
cohesion 

well  defined  interface 

globals,  passed  parameters,  execution  coupling 
data  structure  coupling,  parameterless  calls 
ease  of  change 
complexity 

Functional  decomposition 
Concurrent  documentation 

The goal was to design an automated system for computing the quality. There are automated tools commercially available which claim to calculate numbers for source code quality. One such tool is ADAMAT. Unfortunately, the metric is proprietary, making analytic evaluation difficult. Another commercial tool is Logiscope, from the French company Verilog [35]. It collects and analyzes statistics from the source code of programs. This package uses the metrics of Halstead, McCabe, and Mohanty discussed later in this paper.

In order to summarize standard metrics being used, the IEEE has published a standard dictionary of software measures [20] which presents a collected reference list of descriptions of useful metrics. The list of Table 1 categorizes the measures into seven groups. The number in parentheses is the number of the measure as it appears in the dictionary. Measures preceded by I are intensive and those preceded by E are extensive in nature.

TABLE 1. Software Measures.

Based on Faults
    I  (1)   Fault Density
    I  (2)   Defect Density
    E  (3)   Cumulative Failure Profile
    E  (4)   Fault-Days Number
    I  (8)   Defect Indices
    E  (9)   Error Distribution
    I  (11)  Manhours Per Defect
    I  (20)  Mean Time To Discover the Next K Faults
    I  (21)  Purity Level
    I  (22)  Estimated Number of Faults Remaining (by seeding)
    I  (27)  Residual Fault Count
    I  (28)  Failure Analysis by Elapsed Time
    I  (29)  Testing Sufficiency
    I  (30)  Mean Time to Failure
    I  (31)  Failure Rate
    I  (36)  Test Accuracy
    E  (38)  Independent Process Reliability

Based on Requirements
    I  (5)   Functional or Modular Test Coverage
    I  (6)   Cause and Effect Graphing
    I  (7)   Requirements Traceability
    I  (10)  Software Maturity Index
    E  (12)  Number of Conflicting Requirements
    E  (17)  Minimal Unit Test Case Determination
    I  (23)  Requirements Compliance
    I  (24)  Test Coverage
    I  (35)  Completeness

Related to Test Design
    I  (5)   Functional or Modular Test Coverage
    E  (17)  Minimal Unit Test Case Determination
    I  (18)  Run Reliability
    I  (24)  Test Coverage
    I  (26)  Reliability Growth Function

Related to Variable Counts
    E  (14)  Software Science Measures
    E  (25)  Data or Information Flow Complexity

Related to Software Structure
    E  (13)  Number of Entries and Exits per Module
    E  (15)  Graph-theoretic Complexity for Architecture
    E  (16)  Cyclomatic Complexity
    E  (32)  Software Documentation and Source Listings

Based on Performance
    E  (37)  System Performance Reliability
    I  (39)  Combined Hardware and Software Operational Availability

Based on Management Parameters
    I  (33)  RELY (Required Software Reliability)
    I  (34)  Software Release Readiness


In terms of internal software attributes, there are several important factors to determine in measuring software quality. Table 2 shows some factors and how they might be measured during the various phases of software development. Experimentation must determine which factors correlate with errors and maintenance and to what extent. There is large potential for significant work in this area. As pointed out in the previous sections of this paper, several factors and metrics have already been identified as significant.

TABLE 2. Sources for Metric Data.

factor \ phase                  HMCD   CMCD   DD     PDL    CODE   TEST

number of modules accessing
    data variable               C      C      C      C      C      AEC
    control variable            C      C      C      C      C      AEC
    data structure              C      C      C      C      C      AEC
    subprogram                  -      C      -      C      C      AEC
    exception                   -      C      -      C      C      AEC
    compilation unit            -      C      -      C      C      AEC

items accessed by a module
    data variables              EST    C      C      C      C      AEC
    control variables           EST    C      C      C      C      AEC
    internal variables          -      -      C      C      C      AEC
    internal data struct        -      C      C      C      C      AEC
    data structures             C      C      C      C      C      AEC
    files                       C      C      C      C      C      AEC
    subprograms                 EST    C      -      C      C      AEC
    rendezvous                  C      C      -      C      C      AEC
    exceptions                  -      C      C      C      C      AEC
    compilation units           -      C      -      C      C      AEC
    states                      -      -      -      C      C      AEC
    operation modes             EST    C      -      C      C      AEC
    execution paths             -      -      -      C      C      AEC
    requirements met            C      C      C      C      C      AEC
    algorithm determinacy       EST    EST    -      EGT    EGT    ICB
    commentary description      ICB    ICB    -      EGT    EGT    -
    valid loop termination      -      -      -      -      ICB    ICB
    lines of code               EST    EST    -      EST    C      -

variables' parameters
    scope                       -      TSH    CPA    TSH    TSH    AEC
    value range                 -      -      CPA    CPA    CPA    AEC
    type                        -      -      EGT    EGT    EGT    EGT
    type variegation            -      -      C      C      C      C
    access mechanism            -      -      EGT    EGT    EGT    AEC
    containing structure        -      TCS    TCS    TCS    TCS    AEC
    effect (data, control)      -      EGT    EGT    EGT    EGT    AEC
    exceptions                  -      C      C      C      C      AEC
    machine dependencies        -      -      ICB    ICB    ICB    AEC
    initial value               -      -      ICB    ICB    ICB    ICB

factor \ phase                  PDL    CODE   TEST

    units/scale                 ECI    ECI    AEC
    error containment           ECI    ECI    AEC
    validity provability        ECI    ECI    AEC
    loop termination            ICB    ICB    ICB
    name length                 ICB    ICB    -
    requirement item            C      C      ICB
    format                      EGT    EGT    ICB
    internal cohesion           EGT    EGT    -
    volatility                  EGT    EGT    AEC
    commentary description      EGT    EGT    -

HMCD, CMCD, and DD entries for these rows, in extracted order (row and column placement not recoverable): EGT; EGT EGT; ICB; C C C; EGT; EGT; EGT; ICB.


Abbreviations:

column headings:
    HMCD    high level module communication diagram
    CMCD    complete module communication diagram
    DD      data dictionary
    PDL     program development language document
    CODE    source code for the computer program
    TEST    test results statistics report

methods of measurement:
    AEC     count in actual execution
    C       count all potential occurrences
    CPA     count possible occurrences in range
    ECI     evaluate computational impact
    EGT     evaluate graded types
    EST     use pre-defined estimation factor
    ICB     increment a count of boolean values
    TCS     trace and count containing structures
    TSH     count total through sub-hierarchy
    -       measurement not appropriate
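
The "C" method (count all potential occurrences) for the first group of Table 2, the number of modules accessing each data item, can be sketched from a module communication diagram. This is a minimal illustration: the function name and the dictionary representation of the diagram are hypothetical, and the module and variable names are invented.

```python
from collections import Counter


def modules_accessing(accesses: dict[str, list[str]]) -> dict[str, int]:
    """Count, for each data item, how many distinct modules access it.

    `accesses` maps a module name to the items it reads or writes.
    """
    counts: Counter[str] = Counter()
    for items in accesses.values():
        # A module counts once per item, however often it touches the item.
        counts.update(set(items))
    return dict(counts)


# Illustrative diagram with three modules sharing two variables.
diagram = {
    "guidance": ["range", "mode"],
    "tracker": ["range"],
    "display": ["mode", "range", "range"],
}
print(modules_accessing(diagram))
```

The same count taken during test (the AEC method) would use accesses observed in actual execution rather than those declared in the design.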

In the design phase, most of the evaluation must be subjective. Initial software module communication diagrams can be used to establish and evaluate encapsulation and abstraction of variables, data structures, external devices and subprogram units. The Hrair limit, which bounds the number of structures at any one level, can be enforced by counting modules. The initial graphs can be used to aid construction of a requirements traceability matrix and a data dictionary which contain much more precise data about the system communication architecture. As the system architecture is recursively refined, more precision is available for measurable quantities. The use of development tools will greatly enhance not only the collection of data for metrics but also the enforcement of development standards which can improve system reliability.
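
Enforcing the Hrair limit by counting modules can be sketched as below. This is a minimal illustration: the function name and the dictionary form of the design hierarchy are hypothetical, and the default limit of 9 (seven plus two) is an assumption, since this report does not state a value.

```python
def modules_over_limit(children: dict[str, list[str]], limit: int = 9) -> list[str]:
    """Return, in sorted order, the modules with more than `limit` direct
    submodules in the design hierarchy."""
    return sorted(name for name, subs in children.items() if len(subs) > limit)


# Illustrative hierarchy: "root" has ten direct submodules, "nav" has two.
design = {
    "root": [f"unit_{n}" for n in range(10)],
    "nav": ["filter", "steering"],
}
print(modules_over_limit(design))
```

Because the check needs only the module hierarchy, it can run on the first high level communication diagram, long before any source code exists.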

When designing and writing programs, the programmer/analyst is aware of specific attributes which add to or reduce the chaos, or improve the probability of success, of the software as an entity. Low scores are desirable for the m(x) factor in the COCOMO model, low scores are desirable for function point parameters, high ratings are desirable for the "ilities" of subjective software quality, low probability of faults is needed, high scores on commercial metrics are desired, and low complexity metric values are sought. In order to measure the quality of software, key elements from the usable indicators need to be extracted and distilled into a well formulated, consistent, and validatable model. To achieve coveted values for metrics, the code must adhere to certain standards which are measurable from the design and code of the software. For example, it must

1.  exhibit a low degree of coupling between modules,

2.  have a highly coherent algorithmic structure,

3.  demonstrate good encapsulation of data structures, procedures, and functions, with
proper exception handling,

4.  use well defined typing of variables, including value limits and initialization,

5.  have a standard well formed architecture,

6.  demonstrate adequate commentary,

7.  use only portable features of the implementation language.


None of the methods seen thus far can measure, from design through test, the internal quality, complexity, and progress of computer program development. In other words, the standard measures for software fall short of being able to measure effectively the factors which contribute to difficulty of creation and maintenance. There should be an automatable software metric designed which can integrate quality and architectural factors into an intuitively interpretable dynamic gauge of the product.

C.  Supportability  Factors 

1.  Distinct Programming Languages - The number of distinct programming languages
utilized in the system.

2.  Distinct  Architectures  -  The  number  of  distinct  hardware  architectures  utilized 
in  the  system.  Families  of  processors,  such  as  the  680X0,  are  considered  a  single  architecture. 

3.  Delivery - The level of Life Cycle Software Support Environment hardware,
support software, and documentation delivered.

Criteria:

    a.  all commercial, project funded
    b.  proprietary software rights
    c.  all applicable licenses
    d.  executing hardware
    e.  software documentation
    f.  software users manuals
    g.  proper operation is demonstrated

Rating         Criteria marked
NOMINAL        X  X  X  X  X  X  X
LOW            X  X  X  X  X
VERY LOW       X  X
EXTRA LOW      X

4.  Software Management - The level of software configuration management, software
quality management, and management insight into the software development process to
ascertain satisfactory development progress and status accounting.

Criteria:

    a.  software quality assurance independent of the software development project
        management, formal procedures to assure periodic management review of the
        status of the software development process, mechanisms in place for assuring
        that software subcontractors follow a disciplined software development process,
        formal configuration management of the tactical and support software
    b.  coding standards, internal independent verification and validation, and
        software development indicators
    c.  informal software configuration management, limited software quality
        assurance, and minimal design reviews

Rating         Criteria marked
HIGH           X  X
NOMINAL        X
LOW            X

5.  Subsystem Size - The size of the software subsystem. The subsystem size factor
takes into account the size, in lines of code, of the subsystem software and the
implementation language.

ADA HOL - Software programmed in Ada and compiled on a validated Ada compiler.

NON-ADA STANDARD HOL - Software programmed in a standard high order language other
than Ada.

SPECIAL APPLICATION ASSEMBLY LANGUAGE - Software programmed in assembly language for a
special purpose processor.

6.  Design  Complexity  -  The  complexity  of  the  software  subsystem. 

7.  Memory Utilization - The program instruction and data storage memory
utilization of the target processor(s).

8.  Throughput  Utilization  -  The  throughput  utilization  of  the  target  processor(s). 

9.  Program Design Language (PDL) Implementation - The PDL used in developing
the subsystem software.

HIGH  -  An  Ada  format  PDL  which  can  be  successfully  compiled  by  a  validated  Ada  compiler 
is  used  during  the  design  of  the  software. 

NOMINAL  -  A  non-Ada  PDL  is  used  in  designing  the  software. 

LOW  -  No  PDL  is  used  in  designing  the  software. 

10.  Processor Type - The type of processing element which executes the subsystem
software.

HIGH  -  The  software  is  executed  on  a  commercial  computer. 

NOMINAL  -  The  software  is  executed  on  a  standard  general  purpose  processor. 

LOW  -  The  software  is  executed  on  a  special  purpose  processor. 

VERY LOW - The software is executed on a processor developed specifically for the
application.


11.  Computer Turnaround Time - The computer response time of the software
support environment (compile time, etc.). TURN: Computer Turnaround Time.

12.  Modern Programming Practices - The degree to which modern software
engineering and programming practices are used in developing the software. MODP: Use of
Modern Programming Practices.

13.  Tools - The degree to which automated software tools are used in developing
the software subsystem.

TOOL: Tool support.

Criteria:

    a.  development tools commercially available
    b.  on call maintenance is available
    c.  maintenance contract is available
    d.  support is available
    e.  custom tools

Rating         Criteria marked
VERY HIGH      X  X  X  X
HIGH           X  X  X
NOMINAL        X  X
               X  X        development tools obsolete; no support


14.  Software Documentation - The level of software documentation developed for
the software subsystem.

Criteria:

    a.  Approved Standard Software Documentation
    b.  Independent Verification and Validation
    c.  documentation includes software requirements specifications, software design
        documents, software test plans, procedures, and reports, and product
        specification
    d.  contractor format

Rating         Criteria marked
HIGH           X  X  X  X
NOMINAL        X  X  X
LOW            X
VERY LOW       X


15.  Software Testing - The level of software testing performed during
software development.

Criteria:

    a.  Approved Standard Software Testing
    b.  Independent Verification and Validation
    c.  testing includes unit, major components, system integration testing
    d.  customer witnessing of the formal tests
    e.  internal contractor procedures

Rating         Criteria marked
HIGH           X  X  X  X
NOMINAL        X  X  X
LOW            X
VERY LOW       X

D.  Structural Complexity Measures

There  are  factors  and  characteristics  of  the  software  which  are  known  very  early  in 
the  design  process  and  which  can  be  measured  throughout  the  development  process.  The  focus 
of  effort  should  be  on  identifying  those  factors  and  integrating  them  into  a  validatable  metric 
based  on  software  internal  properties  yet  reflecting  quality  and  development  progress  at  the 
management  level.  There  is  some  research  leading  in  this  direction. 

There are two classical metrics of code structure. Software science [17] uses the
number of distinct operators $n_1$, the number of distinct operands $n_2$, the total number of
operators $N_1$, and the total number of operands $N_2$ to calculate several characteristics
such as

    program vocabulary     $n = n_1 + n_2$
    observed length        $L = N_1 + N_2$
    estimated length       $L' = n_1 \log_2 n_1 + n_2 \log_2 n_2$
    volume                 $V = L \log_2 n$
    difficulty             $D = (n_1/2)(N_2/n_2)$
    program level          $L_1 = 1/D$
    effort                 $E = V/L_1$
    number of errors       $B = V/3000 = E^{2/3}/3000$
    alternate length       $L'' = \log_2(n_1!) + \log_2(n_2!)$

of which volume is most often cited. The other classical metric is cyclomatic complexity [25],
in which a directed linear graph of the software execution path structure is analyzed. The
graph is made strongly connected by joining end to beginning. Then N is the number of
nodes, E is the number of edges, SN is the number of splitting nodes, and RG is the number of
regions of the resulting graph. The complexity C is calculated as

$C = E - N + 1 = RG = SN + 1$.

Both of these metrics are often used as a baseline for evaluating other metrics because data are
relatively easy to obtain.
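
The two classical metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the operator and operand counts, and the small flow graph are hypothetical and do not come from the cited studies. The counts are assumed to be positive, and the graph is assumed to have a single entry and a single exit.

```python
import math


def halstead(n1, n2, N1, N2):
    """Software-science measures from operator and operand counts."""
    n = n1 + n2                                      # program vocabulary
    L = N1 + N2                                      # observed length
    L_est = n1 * math.log2(n1) + n2 * math.log2(n2)  # estimated length
    V = L * math.log2(n)                             # volume
    D = (n1 / 2) * (N2 / n2)                         # difficulty
    E = V * D                                        # effort, V / L1 with L1 = 1/D
    return {"vocabulary": n, "length": L, "estimated_length": L_est,
            "volume": V, "difficulty": D, "level": 1 / D, "effort": E,
            "errors": V / 3000}


def cyclomatic(edges, entry, exit_node):
    """C = E - N + 1 after joining the exit node back to the entry node."""
    closed = list(edges) + [(exit_node, entry)]      # make the graph strongly connected
    nodes = {v for edge in closed for v in edge}
    return len(closed) - len(nodes) + 1


# Hypothetical counts for a small routine.
m = halstead(n1=10, n2=8, N1=40, N2=24)

# Flow graph of a single if-then-else: "a" is the only splitting node.
if_else = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
c = cyclomatic(if_else, entry="a", exit_node="d")
```

For the if-then-else graph the result agrees with the splitting-node form, since one splitting node gives SN + 1 = 2.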

A most notable effort in the right direction is the work on software metrics based on
information flow [18, 19]. This method uses a lexical approach to measuring system
connectivity. Passed parameters and accesses to global data stores are tallied to give the number
of inputs and outputs for each module. Lines of source code are also tallied. The complexity of
a module is calculated as

$\text{complexity} = (\text{lines of code}) \times (\text{inputs} \times \text{outputs})^2$.

The lines of code estimate turns out to be a non-critical factor and does not need to be precise.
In fact, it was noted that the length factor may detract from the accuracy of the metric.

The complexity so calculated reveals several things about the structure of the
code. For example, if the number of inputs and outputs is high, the module may be
implementing more than one elementary function. A large input*output product can indicate
stress points in the system because more effect on the system is indicated. Inadequate
refinement of modules also leads to a large I/O product. Thus, high complexity for a module
may not indicate a specific problem but can show that there is a problem. The complexities of
individual modules are linearly summed to give the complexity of subsystems and systems.

Information flow metrics correlate well with change records of UNIX operating system
maintenance [22]. In comparison with other standard complexity metrics, this method does
quite well. The correlation factor with system errors was .95 for information flow, .96 for
Halstead, and .89 for McCabe. Halstead and McCabe correlated at .84, but information flow
correlated .38 with Halstead and .35 with McCabe. This indicates a fair degree of
orthogonality between information flow and the classical metrics of Halstead and McCabe.
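
As a minimal sketch of the information flow calculation, the following Python fragment computes the per-module complexity and the linear system sum. The class name, field names, and the two example modules are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    loc: int       # lines of source code
    inputs: int    # passed parameters and global reads tallied as inputs
    outputs: int   # returned values and global writes tallied as outputs


def module_complexity(m: Module) -> int:
    """(lines of code) * (inputs * outputs) ** 2 for a single module."""
    return m.loc * (m.inputs * m.outputs) ** 2


def system_complexity(modules) -> int:
    """Module complexities are summed linearly for a subsystem or system."""
    return sum(module_complexity(m) for m in modules)


# Hypothetical modules: the one with the larger I/O product dominates the sum.
parse = Module("parse", loc=120, inputs=3, outputs=2)
report = Module("report", loc=40, inputs=1, outputs=1)
total = system_complexity([parse, report])
```

Because the I/O product is squared, a module with many inputs and outputs stands out even when it is short, which matches the observation that the length factor is not critical.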

Another metric [14] uses an abstract state machine description of the software. A
functional basis is used for state definitions. Links are then established in semantic nets which
reflect the requirements. The net links objects, sets, and actions embodied in the system. The
state machines are described in three ways: by enumeration, listing the states; by axiomatics,
listing logical conditions characterizing the system; and by algorithmic analysis, defining range
and domain. In this case, system analysis is made manageable by a hierarchical decomposition
of the system. The algorithmic paradigm provides a way to extend the state description to
mathematical methods. The most difficult part of the analysis is the mapping of requirements
to specifications. However, the method does reduce ambiguity by delineating explicit relations
in a defined context.

Structure and style contribute to software quality and should be factored into metrics.
For example, encapsulation of data, procedures, functions, and data structures can play an
important role in the successful development of a software project, particularly in avoiding
lurking side effects. A lurking side effect is one of those little illicit data item changes that
jump out to bite the unsuspecting software maintainer/user at the most embarrassing moment.
The varying degrees of coupling between software modules can have varying degrees of impact
on the architectural structure. The quality of cohesion within a module affects the conceptual
complexity of the module and should in some way be reflected in a quality metric. Yet not all
criteria for good software can be measured. Indeed, there are some software standards which
should be accepted as minimum attributes requiring no measurement.

There are issues critical to software development and quality which are not addressed
by the available metrics. Computability and formal verification form entire disciplines rife
with active research. A great amount of maturity will be required in these areas before
relevant metrics can be integrated into system evaluation.

There are then many holes in the software metric picture. Obviously, that is why
much research is being published. Perhaps there is a way that many diverse properties and
concerns related to success of software systems can be brought together under a single simple
but powerful intuitive model which generates some useful metrics. A concept which did a
similar service in thermodynamics and information theory is the concept of entropy.

III.  A UNIFIED PARADIGM FOR SOFTWARE METRICS

Rather than relying on several metrics at various respective stages of software
development, this section describes a model by which some of the measures can be integrated
into a single one based on a physical analogy.

The concept of entropy has been suggested to interpret the development of software as a
process of reducing entropy of the system design [21]. Evaluation of hardware complexity has
also been done [23]. Notable use of an entropy measure has been made in evaluating software
design [27]. Entropy has been used as a metric for software complexity relative to cellular
array machines [15]. These uses of the concept of entropy are obviously different and not to be
confused, but the analogies are very useful intuitively when used with care.


In physics, change in entropy S can be defined as an integral of the reciprocal of
temperature T with respect to the differential of heat dQ. For a reversible process with volume
changing from $V_1$ to $V_2$, the change in entropy from $S_1$ to $S_2$ is

$S_2 - S_1 = \int dQ/T = k(\ln V_2 - \ln V_1)$,

giving entropy a basically logarithmic form [16]. The base of the logarithm and the constant k
determine the units of measure. In communication theory, the logarithm base is 2 so that the
unit of measure is the bit. From communication theory [32, 7], entropy is expressed as

$S = -\sum_i p_i \log_2 p_i$,

where $p_i$ is the probability of message i occurring and

$\sum_i p_i = 1$.
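
The communication-theory form can be illustrated with a short Python sketch. The function name and the example distributions are hypothetical; the only requirement taken from the text is that the probabilities sum to 1.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # p * log2(1/p) is the same quantity as -p * log2(p).
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


uniform = entropy([0.25, 0.25, 0.25, 0.25])   # four equally likely messages
skewed = entropy([0.5, 0.25, 0.25])           # one message more likely than the others
certain = entropy([1.0])                      # a single certain message
```

Entropy is largest when all messages are equally likely and falls to zero when the outcome is certain, which is the intuition carried over to software in the remainder of this section.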

In software design evaluation, entropy is expressed as

$H(p_1, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log_2 p_i$,

where $x_1, \ldots, x_n$ are distinct classes of subsystems called and

$p_i = |x_i| / |x|$

is the probability of subsystem $x_i$ being activated. For software complexity on cellular array
machines, entropy is defined as the maximum performance factor over the array,
$\max_a PF_c$, times the number c of cells in the array, or

$S = c \cdot \max_a PF_c$.

The performance factor $PF_c$ is defined in terms of $SPF_{ci}$, the product of the number of
states used times the Hamming distance between binary state use vectors, and $LPF_{ci}$, the
product of the number of communication links used times the Hamming distance between
binary link use vectors, as

$PF_c = \sum_i \log_2 (1 + SPF_{ci} + LPF_{ci})$.


None of these metrics provides what is needed for software tracking. However, by nature,
entropy should be linearly summable; yet the combinatoric nature of error probabilities suggests
that factors should be combined multiplicatively. Therefore, the traditional logarithmic nature
seems correct. What, then, will be the precise form and what are the contributing factors for
software entropy?

For software, entropy should indicate in some sense the possibility of an error occurring.
Many factors contribute to errors, such as typographical errors, variables exceeding boundaries,
logic errors, sheer size of the project, the nature of the language used, syntax errors, etc.
Reducing the possibility of error, that is, reducing entropy, is done in different ways at
different times in the development process. For example, during requirements specification,
limits and performance parameters must be clearly spelled out, keeping in mind the development
methodology to be used. In the initial design phase, entities, operations, and communication
must be formulated to minimize entropy by using sound architecture and methodology. During
testing, each test should lower system entropy by proving doubtful constructs in the code and
verifying that requirements are met. When errors are found in testing, entropy increases. New
tests must be designed to again lower entropy to pre-error-detection levels. Thus, to measure
entropy, different data must be used in different phases of a project, yet scale factors must be
included for compatibility of the metric throughout the project. The process would be
structured as shown in Figure 2.

The basic form for software entropy is

$\mathrm{SWENT} = \sum_j c_j \left[ \sum_i \left( (1/T) \log (s f p) \right)_i \right]_j$,

where $c_j$ is an empirical constant based on the software development phase j under analysis.
The external sum is taken over the various phases. The internal sum is taken over the
individual modules i of the software. The factor T reflects the quality of design usage of
entities such as variables, subprograms, and data structures. The factor s reflects the number of
operational states or modes of the module i. The factor f measures the interface size for the
module. The cyclomatic complexity for the state structure is p. Each of these factors will be
described in detail.

Figure 2.  Entropy Metric Cycle.


For the initial highest level design phase, the constant $c_j$ would be relatively large due to
the uncertainties of measurements and estimates at that stage. It must also be large since the
number of measurable factors is rather small, yet the factors which can be measured have
potential for high impact on system entropy. In the later stages where source code data are
available, $c_j$ would be relatively small. This is the case because more factors will be summed
and because each factor has potentially less relative impact on system entropy. The constant
$c_j$ is negative for the test phase because testing will decrease the system entropy, indicating
an increase in confidence in the software. The entropy of a system will never be zero or
negative, so the constants $c_j$ should be chosen accordingly. Each time new design or
revisions are made, entropy will increase and will have to be brought back down by more testing.

The factor T is called “influence” and is computed as

$T = \prod_j \prod_i (w_i m_{ij})^{x}$,

where the product is taken over all attributes i for all of the specific entities j being evaluated.
Influence, T, is intended to be a gauge of the effects of encapsulation and abstraction of entities
within the sphere of influence of the module being considered. The entities for which T is
calculated for each module include all variables, data structures, subprograms, and external
utilities referenced by the given module. The factor $w_i$ for each attribute i is a weighting
factor which is empirically determined; $w_i$ is higher for attributes which have higher
correlation with module success. Success in this sense is determined by faults detected in
systems for which data have been collected. The exponent x is +1 or -1 depending on whether
the attribute is advantageous or detrimental to software quality. The factor $m_{ij}$ is the actual
measure of attribute i for entity j. The attributes and their measurement methods are shown in
Table 3. If attribute i does not apply to a given entity, the value of $w_i m_{ij}$ is unity so that
T is not affected.
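
A minimal Python sketch of the influence calculation follows. It is an illustration only: the weights, the measures, and the choice of attributes are hypothetical, and the exponents follow the x column of Table 3 (+1 for attributes that help quality, -1 for those that hurt it). Measures are assumed to be positive numbers, and an attribute that does not apply to an entity is simply left out so that it contributes a factor of unity.

```python
def influence(entities, weights, exponents):
    """T as the product over entities j and attributes i of (w_i * m_ij) ** x_i.

    entities  : list of dicts mapping attribute index i to its measure m_ij
    weights   : dict mapping attribute index i to the empirical weight w_i
    exponents : dict mapping attribute index i to +1 or -1
    """
    t = 1.0
    for measures in entities:
        for i, m in measures.items():
            t *= (weights[i] * m) ** exponents[i]
    return t


# Hypothetical weights and exponents for three attributes of Table 3:
# 1 = Scope, 9 = Initialization, 13 = Name Length.
weights = {1: 1.0, 9: 2.0, 13: 0.5}
exponents = {1: -1, 9: +1, 13: +1}

# Two hypothetical entities referenced by one module.
entities = [
    {1: 4, 9: 1, 13: 8},   # variable seen by 4 modules, initialized, 8-character name
    {1: 2},                # utility seen by 2 modules; other attributes do not apply
]
T = influence(entities, weights, exponents)
```

Since entropy is proportional to 1/T, attributes with exponent +1 raise T and lower entropy, while attributes with exponent -1 do the opposite.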


TABLE 3.  Code Characteristics Factors.

Attribute / description                            i     x    Measurement

Scope /                                            1    -1    count modules accessing
  the scope of accessed entities

Value Range /                                      2    +1    boolean existence of specified range
  is the range of value restricted and how         3    -1    numeric range relative to median
                                                   4    -1    enumeration cardinality

Access /                                           5    -1    number of nodes traversed to value
  difficulty extracting an item from a
  structure

Coupling Effect /                                  6    -1    value = 1, control = 2, rendezvous = 3
  depth of coupling effect of the item

Exception Scope /                                  7    -1    count subprogram levels to resolve
  depth of resolution

Hardware Dependency /                              8    -1    boolean (numeric range, type, format)

Initialization /                                   9    +1    boolean
  initial value defined

Scaling Required /                                10    -1    float = 1, integer = 5, interface = 20
  use of scale factors

Error Estimate /                                  11    -1    % relative numeric error introduced
  impact of calculation errors

Loop Termination /                                12    -1    boolean yes for loop terminators
  loop control effect

Name Length /                                     13    +1    count characters in name
  encourage meaning in naming entities

Requirement Item /                                14    +1    boolean existence of traceability
  defined in system specification

Internal Cohesion /                               15    -1    count classes of data items
  classes of data controlled by entity

Type /                                            16    +1    existence and variegation booleans
  unique type declared

Validity /                                        17    +1    boolean: provably valid
  provably valid calculation for value            18    -1    probability of invalid result

Volatility /                                      19    -1    value changes per operation cycle
  changeability of the entity                     20    -1    changes in entity definition

Commentary /                                      21    +1    characters in descriptive comments

The factor f, called “interface”, measures the impact of the module through inputs, outputs
and, possibly superfluously, the size of the module. This factor has been studied and validated
to a certain extent [18]. It is calculated as

$f = \mathrm{loc} \times (\mathrm{inputs} \times \mathrm{outputs})^2 = l\,(i\,o)^2$,

where “loc” is the number of lines of source, “inputs” is the number of inputs to the module,
and “outputs” is the number of outputs from the module. It has been shown that loc may not
be significant [18].

The factor s, called “states”, is a count of the states or modes resident in respective
modules of software. That is,

$s = \text{number of states}$.

This measure may be rather subjective, but it can be calculated from an enumeration typed
variable which can define the state of the software module. This factor harks back to the usage
of machine state definitions of software complexity [14, 8, 15].

The factor p, called “paths”, is the $\log_2$ of the count of the execution paths in the
algorithms of a module. For example, paths may be counted in certain types of algorithms as
$2^b$, where b is the number of binary branch decision statements, or as the number of linearly
independent execution paths in the cyclomatic complexity sense.

During the testing phase of software development, the entropy should decrease. This
happens through the combination of p, the number of paths tested, and the factor $c_j$ being
negative. As errors are discovered in testing, entropy terms are added back in for that module
until testing can bring the figure back down to the normal decreasing trend for the testing phase.
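
The way the factors combine across phases can be sketched as follows. This is a hedged illustration, not a calibrated implementation: the phase constants, the module measures, and the choice of base-2 logarithm are assumptions (the text leaves the base and the empirical constants open), and the product s*f*p is assumed to be positive.

```python
import math
from dataclasses import dataclass


@dataclass
class ModuleMeasure:
    T: float   # influence
    s: int     # number of states
    f: float   # interface
    p: float   # paths


def module_term(m: ModuleMeasure) -> float:
    """(1/T) * log(s * f * p) for one module; base 2 is an assumption."""
    return (1.0 / m.T) * math.log2(m.s * m.f * m.p)


def swent(phases) -> float:
    """Sum over phases j of c_j times the sum of module terms in that phase.

    phases : list of (c_j, list of ModuleMeasure) pairs
    """
    return sum(c * sum(module_term(m) for m in modules) for c, modules in phases)


# Hypothetical module: s * f * p = 256, so log2 gives 8 and the term is 8 / T = 4.
module = ModuleMeasure(T=2.0, s=4, f=16.0, p=4.0)

design_only = swent([(1.0, [module])])                        # before any testing
after_tests = swent([(1.0, [module]), (-0.25, [module])])     # test phase has negative c_j
```

In this sketch the negative test-phase constant removes part of the entropy contributed during design. A discovered error would be modeled by adding a positive term for the affected module again.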

The calculation of entropy includes the factors most important for gauging software
quality and progress. One remaining problem with the use of this entropy as a metric for
software quality and progress is that it is not an absolute quantity and a relativity base must be
established for given types and sizes of applications. A graph of the progress of entropy trends
during the development of a project has more meaning than a single abstract value at a single
point in time. The main advantage is the ability to track software as it becomes more refined.
Comparisons can be made between software projects which use the metrics consistently.
Entropy variability throughout the software project gives a basis for intuitive understanding of
progress. Also, reliability and fault tolerance issues can be addressed using this concept of
entropy.

IV.  APPLICATION OF METRICS TO THE ADA ENVIRONMENT


The structure of the language Ada contributes particularly well to the use of software
engine metrics. The elements required for calculating T, s, f, and p are explicit in Ada program
code. Not only that, software engine metrics, through the mechanism of decreasing entropy in
the system, can encourage the use of constructs provided in Ada explicitly for increasing the
reliability and maintainability of software [6].

A.  Mapping  to  the  Structural  Features  of  Ada 

Encapsulation of variables, subprograms, data structures, data types, etc. is explicit.
Data items are strongly typed, specific in range or by enumeration, initializable, restricted in
scope, and overloadable, with explicit exception handling. Machine dependencies are
traceable through types, number sizes, and pragmas. Compilable modules are explicit in
packages with the ability to abstract structures and subprograms through the use of private
types. Reuse is encouraged through the explicit use of generic subprograms. Communication of
parallel procedures is explicit through task rendezvous. Applications of the software engine
metrics are more difficult with other languages because of differences in available compilers.
However, given that the measure collection tool is compatible with the language compiler, the
metrics can be collected and used for tracking the progress and quality of the software product
as shown in Figure 3.

The Module Communication Diagram is a module which encapsulates and abstracts
the graphics used to specify the highest level design of the software system. It operates in the
earliest phases of software development, requests acceptance of data and sends, to the System
Structure Database and the Variable Property Database, information which it has extracted
from the design graphics. The Variable Property Database is a repository for information about
the properties of variables and data structures in the system. The System Structure Database is
a repository for information about the structure of the software. The Operator is an external
interface which allows control of the metric gathering system. The PDL/Source Code module
encapsulates and abstracts the collection of data from the program design language and program
code. Operating in the software design and coding phases, it requests access and sends data to
the Variable Property Database and the System Structure Database. The Tests module
performs the analogous task through the software testing phase of the development. The Metric
Repository collects data from the Variable Property Database and the System Structure
Database, calculates statistics for metric validation, and computes the metrics. The results are
output at the request of the operator.

Several  commercial  and  public  domain  data  collection  and  metric  calculation  systems 
are  available  such  as  LOGISCOPE  [35],  ADAMAT  [12],  and  tools  available  from  the  National 
Ada  Repository. 

B.  Development  of  a  Validation  Database 

The Software Engineering Directorate (SED) of the U.S. Army Missile Command’s
Research, Development and Engineering Center is responsible for developing guidelines to
ensure that software acquired for missile systems is maintainable. The loop closes when SED
becomes responsible for the maintenance of the software. In this configuration, SED has
access to actively maintained systems and can collect the metrics data it needs for acquisition.
Some research has been done by SED staff in this field.

Figure 3.  Ada Data Extraction System.

The  scope  of  effort  for  a  useful  study  would  follow  a  standard  structure  for  such  projects. 
The  outline  of  tasks  appears  as  follows: 

1.  Design  systems  for  data  collection  regarding 

a.  design  attributes 

b.  code  structure 

c.  maintenance  hours  per  module. 

2.  Select  systems  for  analysis. 

3.  Collect  data  on  relevant  system  components  such  as 

a.  design  attributes 

b.  structure  attributes  of  code 

c.  personnel  hours  of  maintenance  per  module. 


4.  Correlate  design  and  structure  data  with  maintenance  data. 

5.  Analyze  and  summarize  experimental  results. 

6.  Write and publish results.

7.  Design application systems for use of metrics in acquisition of missile system
software.
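
Task 4 of the outline, correlating design and structure data with maintenance data, could begin with a simple correlation coefficient per metric. The sketch below uses the Pearson coefficient as one plausible choice; the outline does not name a statistic, and the per-module values shown are invented solely to exercise the function.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-module data: a structure metric and maintenance hours.
complexity = [10, 20, 30, 40]
hours = [2, 4, 6, 8]
r = pearson(complexity, hours)
```

A coefficient near +1 or -1 for a given metric would suggest that it tracks maintenance effort, while a value near zero would suggest it does not, at least for the systems sampled.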

The data collection and analysis effort would concentrate on the parameters required for
validation of the Software Engine Metrics model as well as standard metrics which can be
further validated. Raw data should be formatted in a manner which gives high resolution access
to the raw measurement process.